Sotho phonology


Notes:

  • The orthography used in this and related articles is that of South Africa, not Lesotho. For a discussion of the differences between the two see the notes on Sesotho orthography.

The phonology of Sesotho and those of the other Sotho–Tswana languages are radically different from those of "older" or more "stereotypical" Bantu languages. Modern Sesotho in particular has very mixed origins (due to the influence of Difaqane refugees), inheriting many words and idioms from non-Sotho–Tswana languages.

There are in total 39 consonantal phonemes[1] (plus 2 allophones) and 9 vowel phonemes (plus two close raised allophones). The consonants include a rich set of affricates and palatal and postalveolar consonants, as well as three click consonants (alternatively, one click pronounced with three accompaniments).


Historical sound changes

Probably the most radical sound innovation in the Sotho–Tswana languages is that the Proto-Bantu prenasalized consonants have become simple stops and affricates.[2] Thus isiZulu words such as entabeni on the mountain, impuphu flour, ezinkulu the big ones, ukulanda to fetch, ukulamba to become hungry, ukuthenga to buy, etc. are cognates to Sesotho thabeng, phofo, tse kgolo, ho lata, ho lapa, and ho reka (with the same meanings).
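The correspondences in the cognate pairs above (for example isiZulu nt in entabeni against Sesotho th in thabeng) can be laid out as a lookup table. The sketch below is illustrative only: it takes the reflexes listed in note 2 of this article at face value, and the names `PRENASALIZED_REFLEX`, `ORTHOGRAPHY`, and `sesotho_spelling` are hypothetical helpers introduced here, not part of any established tool.

```python
# Proto-Bantu prenasalized consonant -> Sotho-Tswana reflex, as listed in note 2.
# The velar affricate is kept as the reflex of *ŋk; the article notes that most
# Sesotho speakers now pronounce it as the fricative /x/.
KX = "k\u0361x\u02b0"  # k͡xʰ, built from code points to avoid look-alike strings

PRENASALIZED_REFLEX = {
    "mb": "pʼ", "nd": "tʼ", "ŋg": "kʼ",   # voiced series -> ejective stops
    "mp": "pʰ", "nt": "tʰ", "ŋk": KX,     # voiceless series -> aspirates
}

# South African orthography for each reflex, read off the consonant tables.
ORTHOGRAPHY = {"pʼ": "p", "tʼ": "t", "kʼ": "k", "pʰ": "ph", "tʰ": "th", KX: "kg"}


def sesotho_spelling(proto_cluster):
    """Return the Sesotho spelling of the reflex of a prenasalized consonant."""
    return ORTHOGRAPHY[PRENASALIZED_REFLEX[proto_cluster]]


if __name__ == "__main__":
    for cluster in PRENASALIZED_REFLEX:
        print(f"*{cluster} -> {sesotho_spelling(cluster)}")
```

The table covers only the six prenasalized consonants named in note 2; other changes visible in the cognates (such as th against r in ukuthenga and ho reka) are outside its scope.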

This is further intensified by the law of nasalization and nasal homogeneity, making derived and imported words have syllabic nasals followed by homogeneous consonants, instead of prenasalized consonants.

Another important sound change in Sesotho which distinguishes it from almost all other Sotho–Tswana languages and dialects is the chain shift from /x/ and /k͡xʰ/ to /h/ and /x/ (the shift of /k͡xʰ/ to /x/ is not yet complete).

In certain respects, however, Sesotho is more conservative than other Sotho–Tswana languages. For example, the language still retains the difference in pronunciation between /ɬ/, /t͡ɬʰ/, and /tʰ/.[3] Many other Sotho–Tswana languages have lost the fricative /ɬ/, and some Northern Sotho languages, possibly influenced by Tshivenda, have also lost the lateral affricate and pronounce all three historical consonants as /tʰ/. They have also lost the distinction between /t͡ɬ/ and /t/; thus, for example, speakers of the Northern Sotho language commonly called Setlokwa call their language "Setokwa".[4]

The existence of (lightly) ejective consonants (all unvoiced unaspirated stops) is very strange for a Bantu language and is thought to be due to Khoisan influence. These consonants occur in the Sotho–Tswana and Nguni languages (being over four times more common in Southern Africa than anywhere else in the world), and the ejective quality is strongest in isiXhosa, which has been greatly influenced by Khoisan phonology.

As with most other Bantu languages, almost all palatal and postalveolar consonants are due to some form of palatalization or other related phenomena which result from a (usually palatal) approximant or vowel being "absorbed" into another consonant (with a possible subsequent nasalization).

The Southern Bantu languages have lost the Bantu distinction between long and short vowels. In Sesotho the long vowels have simply been shortened without any other effects on the syllables, while sequences of two dissimilar vowels have usually resulted in the first vowel being "absorbed" into the preceding consonant, causing changes such as labialization and palatalization.

As with most Southern African Bantu languages, the "composite" or "secondary" vowels *e and *o have become /ɛ/ and /e/, and /ɔ/ and /o/. These usually behave as two phonemes (conditioned by vowel harmony), although there are enough exceptions to justify the claim that they have become four separate phonemes in the Sotho–Tswana languages.

Additionally, the first-degree (or "superclose", "heavy") and second-degree vowels have not merged as in many other Bantu languages, resulting in a total of 9 phonemic vowels.

Uniquely among the Sotho–Tswana languages, Sesotho has adopted a click sound that is pronounced with three accompaniments (tenuis, aspirated, and nasalized). It most probably came with loanwords from the Khoisan and Nguni languages, though it also exists in various words which don't exist in these languages and in various ideophones.

This click also appears in certain situations which are rare or non-existent in the Nguni and Khoisan languages, such as a syllabic nasal followed by a nasalized click (nnq in nnqane that other side), a syllabic nasal followed by a tenuis click (also written nq in senqanqane frog; this is not the same as the prenasalized radical click written nkq in the Nguni languages), and a syllabic nasal followed by an aspirated click (nqh in seqhenqha hunk).

Vowels

Sesotho has a large inventory of vowels compared with many other Bantu languages. However, the nine phonemic vowels are collapsed into only five letters in the Sesotho orthography. The two close vowels i and u (sometimes called "superclose" or "first-degree" by Bantuists) are very high (with ATR) and are better approximated by French vowels than English vowels.

Vowels[5]
IPA | Example | Approximate equivalent
/i/ | ho bitsa (to call) | beet
/u/ | tumo (fame) | boot
/ɪ/ | ho leka (to attempt) | pit
/ʊ/ | potso (query) | put
/e/ | ho jwetsa (to tell) | cafe
/o/ | pontsho (proof) | oiseau
/ɛ/ | ho sheba (to look) | bed
/ɔ/ | mongolo (writing) | board
/ɑ/ | ho abela (to distribute) | spa
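To make the collapse of nine phonemes into five letters concrete, the following sketch inverts a phoneme-to-letter mapping. The mapping is an assumption read off the example words in the vowel table (for instance, /ɪ/ is spelled e in ho leka and /ʊ/ is spelled o in potso); the names `VOWEL_TO_LETTER` and `letters_to_phonemes` are hypothetical.

```python
# Nine vowel phonemes -> five orthographic letters, inferred from the
# example words in the vowel table (an assumption, not an official chart).
VOWEL_TO_LETTER = {
    "i": "i",
    "ɪ": "e", "e": "e", "ɛ": "e",
    "ɑ": "a",
    "ɔ": "o", "o": "o", "ʊ": "o",
    "u": "u",
}


def letters_to_phonemes(mapping):
    """Invert the mapping: for each letter, list the phonemes it may spell."""
    inverse = {}
    for phoneme, letter in mapping.items():
        inverse.setdefault(letter, []).append(phoneme)
    return inverse


AMBIGUITY = letters_to_phonemes(VOWEL_TO_LETTER)

if __name__ == "__main__":
    for letter, phonemes in AMBIGUITY.items():
        print(letter, "spells", ", ".join(f"/{p}/" for p in phonemes))
```

Under this mapping the letters e and o are each three-ways ambiguous, while i, a, and u each spell a single phoneme.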

Consonants

The Sotho–Tswana languages are peculiar among the Bantu family in that most do not have any prenasalized consonants and have a rather large number of heterorganic compounds. Sesotho, uniquely among the recognised and standardised Sotho–Tswana languages, also has click consonants acquired from the Khoisan and Nguni languages.

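Manner | Type | Labial | Alveolar (central) | Alveolar (lateral) | Post-alveolar | Palatal | Velar | Uvular | Glottal
Click | glottalized | | | | ǃˀ | | | |
Click | aspirated | | | | ǃʰ | | | |
Click | nasal | | | | ᵑǃ | | | |
Nasal stop | | m | n | | | ɲ | ŋ | |
Plosive | ejective | pʼ | tʼ | | | | kʼ | |
Plosive | aspirated | pʰ | tʰ | | | | kʰ | |
Plosive | voiced | b | (d)¹ | | | | | |
Affricate | ejective | | tsʼ | tɬʼ | tʃʼ | | | |
Affricate | aspirated | | tsʰ | tɬʰ | tʃʰ | | kxʰ / x | |
Fricative | voiceless | f | s | ɬ | ʃ | | | | h ~ ɦ
Fricative | voiced | | | | ʒ / dʒ | | | |
Approximant | | | | l | | j | w | |
Trill | | | | | | | | ʀ |
  ¹ [d] is an allophone of /l/, occurring only before the close vowels (/i/ and /u/). Dialectal evidence shows that in the Sotho–Tswana languages /l/ was originally pronounced as a retroflex flap [ɽ] before the two close vowels.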

Sesotho makes a three-way distinction between lightly ejective, aspirated and voiced stops in several places of articulation.

Plosives
Place of articulation | IPA | Notes | Orthography | Example
bilabial | /pʼ/ | unaspirated: spit | p | pitsa (cooking pot)
bilabial | /pʰ/ | | ph | phuputso (investigation)
bilabial | /b/ | this consonant is fully voiced | b | lebese (milk)
alveolar | /tʼ/ | unaspirated: stalk | t | botala (greenness)
alveolar | /tʰ/ | | th | tharollo (solution)
alveolar | [d] | an allophone of /l/, only occurring before the close vowels (/i/ and /u/) | d | Modimo (God)
velar | /kʼ/ | unaspirated: skill | k | boikarabelo (responsibility)
velar | /kʰ/ | fully aspirated: kill; occurring mostly in old loanwords from Nguni languages and in ideophones | kh | lekhokho (the part of the pap that remains baked to the pot after cooking)

Sesotho possesses four simple nasal consonants. All of these can be syllabic and the syllabic velar nasal may also appear at the end of words.

Nasals
Place of articulation | IPA | Notes | Orthography | Example
bilabial | /m/ | | m | ho mamaretsa (to glue)
bilabial | /m̩/ | syllabic version of the above | m | mpa (stomach)
alveolar | /n/ | | n | lenaneo (programme)
alveolar | /n̩/ | syllabic version of the above | n | nna (I)
palatal | /ɲ/ | as Spanish el niño | ny | ho nyala (to marry)
palatal | /ɲ̩/ | syllabic version of the above | n | nnyeo (so-and-so)
velar | /ŋ/ | can occur initially | ng | lengolo (letter)
velar | /ŋ̩/ | syllabic version of the above | n | ho nka (to take)

The following approximants occur. All instances of /w/ and /j/ most probably come from original close /ʊ/, /ɪ/, /u/, and /i/ vowels or Proto-Bantu *u, *i, *û, and *î (under certain circumstances).

Note that when w appears as part of a syllable onset this actually indicates that the consonant is labialized.

Approximants
Place of articulation | IPA | Notes | Orthography | Example
labial-velar | /w/ | | w | sewa (epidemic)
lateral | /l/ | never occurs before close vowels (/i/ and /u/), where it becomes [d] | l | selepe (axe)
lateral | /l̩/ | a syllabic version of the above; note that if the sequence ll is followed by the close i or u then the second l is pronounced normally, not as a d | l | mollo (fire)
palatal | /j/ | | y | ho tsamaya (to walk)
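The allophony described in the table (/l/ surfacing as [d] before the close vowels, except for the second l of an ll sequence) can be sketched as a simple rewrite over a list of segments. This is a minimal illustration under stated assumptions: segments are single symbols, the letters i and u stand for the close vowels /i/ and /u/, and the input forms such as "molimo" are hypothetical underlying representations chosen for the example. The function name `realise_l` is likewise made up here.

```python
CLOSE_VOWELS = {"i", "u"}


def realise_l(segments):
    """Return surface segments, turning /l/ into [d] before a close vowel.

    `segments` is a list of single-symbol strings. An l that directly
    follows another l (the ll sequence with a syllabic first l) is left
    unchanged, following the note in the approximants table.
    """
    surface = []
    for idx, seg in enumerate(segments):
        nxt = segments[idx + 1] if idx + 1 < len(segments) else None
        prev = segments[idx - 1] if idx > 0 else None
        if seg == "l" and nxt in CLOSE_VOWELS and prev != "l":
            surface.append("d")
        else:
            surface.append(seg)
    return surface


if __name__ == "__main__":
    for form in ("molimo", "selepe", "molli"):
        print(form, "->", "".join(realise_l(list(form))))
```

With these inputs, the hypothetical "molimo" surfaces as modimo, selepe is unchanged because e is not a close vowel, and the made-up form "molli" keeps both of its l segments.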

The following fricatives occur. The glottal fricative is often voiced between vowels, making it barely noticeable.[6] The alternative orthography used for the velar fricative is due to some loanwords from Afrikaans and ideophones which were historically pronounced with velar fricatives, distinct from the velar affricate. The voiced postalveolar affricate sometimes occurs as an alternative to the fricative.

Fricatives
Place of articulation | IPA | Notes | Orthography | Example
labiodental | /f/ | | f | ho fumana (to find)
alveolar | /s/ | | s | Sesotho
postalveolar | /ʃ/ | | sh | Moshweshwe (Moshoeshoe I)
postalveolar | /ʒ/ | | j | mojalefa (heir)
lateral | /ɬ/ | | hl | ho hlahloba (to examine)
velar | /x/ | | kg. Also g in Gauta and some ideophones such as gwa ("of extreme whiteness") | sekgo (spider)
glottal | /h/, /ɦ/ | these two sounds are allophones | h | ho aha (to build)

There is one trill consonant. Originally, this was an alveolar rolled lingual, but today most individuals pronounce it at the back of the tongue, usually at the uvular position. The uvular pronunciation is largely attributed to the influence of French missionaries at Morija in Lesotho. Just like the French version, the position of this consonant is somewhat unstable and often varies even in individuals, but it generally differs from the "r"'s of most other South African language communities. The most stereotypical French-like pronunciations are found in certain rural areas of Lesotho, as well as some areas of Soweto (where this has had an impact on the pronunciation of Tsotsitaal).

Trill
Place of articulation | IPA | Notes | Orthography | Example
uvular | /ʀ/ | soft Parisian-type r | r | moriri (hair)

Sesotho has a relatively large number of affricates. The velar affricate, which was standard in Sesotho until the early 20th century, now only occurs in some communities as an alternative to the more common velar fricative.[7]

Affricates
Place of articulation | IPA | Notes | Orthography | Example
alveolar | /t͡sʼ/ | | ts | ho tsokotsa (to rinse)
alveolar | /t͡sʰ/ | aspirated | tsh | ho tshoha (to become frightened)
lateral | /t͡ɬʼ/ | | tl | ho tlatsa (to fill)
lateral | /t͡ɬʰ/ | occurs only as a nasalized form of hl or as an alternative to it[3] | tlh | tlhaho (nature)
postalveolar | /t͡ʃʼ/ | | tj | ntja (dog)
postalveolar | /t͡ʃʰ/ | | tjh | ho ntjhafatsa (to renew)
postalveolar | /d͡ʒ/ | this is an alternative to the fricative /ʒ/ | j | ho ja (to eat)
velar | /k͡xʰ/ | alternative to the velar fricative | kg | kgale (a long time ago)

The following click consonants occur.[8] In common speech they are sometimes substituted with dental clicks. Even in standard Sesotho the nasal click is usually substituted with the tenuis click. nq is also used to indicate a syllabic nasal followed by an ejective click (/ŋ̩ǃkʼ/), while nnq is used for a syllabic nasal followed by a nasal click (/ŋ̩ǃŋ/).

Clicks
Place of articulation | IPA | Notes | Orthography | Example
postalveolar | /ǃkʼ/ | ejective | q | ho qoqa (to chat)
postalveolar | /ᵑǃ/ | nasal; this is often pronounced as an ejective click | nq | ho nqosa (to accuse)
postalveolar | /ǃʰ/ | aspirated | qh | leqheku (an elderly person)

The following heterorganic compounds occur. They are often substituted with other consonants, although there are a few instances when some of them are phonemic and not just allophonic. These are not considered consonant clusters.

In non-standard speech these may be pronounced in a variety of ways. bj may be pronounced /bj/ (followed by a palatal glide) and pj may be pronounced /pjʼ/. pj may also sometimes be pronounced /ptʃʼ/, which may alternatively be written ptj, though this is not to be considered standard.

Heterorganic compounds
Place of articulation | IPA | Notes | Orthography | Example
bilabial-palatal | /pʃʼ/ | alternative tj | pj | ho pjatla (to cook well)
bilabial-palatal | /pʃʰ/ | aspirated version of the above; alternative tjh | pjh | mpjhe (ostrich)
bilabial-palatal | /bʒ/ | alternative j | bj | ho bjarana (to break apart, like a clay pot)
dentilabial-palatal | /fʃ/ | only found in short passives of verbs ending with fa; alternative sh | fj | ho bofjwa (to be tied)

Syllable structure

Sesotho syllables tend to be open, with syllabic nasals and the syllabic approximant l also allowed. Unlike almost all other Bantu languages, Sesotho does not have prenasalized consonants (NC).

  1. The onset may be any consonant (C), a labialized consonant (Cw), an approximant (A), or a vowel (V).
  2. The nucleus may be a vowel, a syllabic nasal (N), or the syllabic l (L).
  3. No codas are allowed.

The possible syllables are:

Note that heterorganic compounds count as single consonants, not consonant clusters.

Additionally, the following phonotactic restrictions apply:

  1. A consonant may not be followed by the palatal approximant y (i.e. Cy is not a valid onset).[9]
  2. The labio-velar approximant w (or a labialized consonant) may not be followed by a back vowel at any time.
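The onset and nucleus rules, together with the two restrictions above, can be expressed as a small well-formedness check. This is only a sketch under several labelled assumptions: a syllable is given as an (onset, nucleus) pair with the onset as a tuple of symbols; an empty onset stands for the vowel-initial case; syllabic consonants are assumed to take no onset; and /ɑ/ is not treated as a back vowel for the purposes of restriction 2, since forms such as -rwala in note 1 show w before a. The name `is_valid_syllable` is hypothetical.

```python
BACK_VOWELS = {"u", "ʊ", "o", "ɔ"}          # rounded back vowels only (assumption)
VOWELS = {"i", "ɪ", "e", "ɛ", "ɑ"} | BACK_VOWELS
# Syllabic m, n, ɲ, ŋ and l, built with the combining mark U+0329.
SYLLABIC_CONSONANTS = {c + "\u0329" for c in "mnɲŋl"}


def is_valid_syllable(onset, nucleus):
    """Check one syllable against the rules sketched in this section.

    onset   -- tuple of consonant symbols: (), ("t",), ("t", "w"), ("j",) ...
    nucleus -- a vowel symbol or a syllabic consonant
    There is no coda slot, so the "no codas" rule holds by construction.
    """
    if nucleus in SYLLABIC_CONSONANTS:
        return len(onset) == 0              # assumption: syllabic C stands alone
    if nucleus not in VOWELS:
        return False
    if len(onset) > 2:
        return False
    if len(onset) == 2 and onset[1] != "w":
        return False                        # rules out *Cy and other clusters
    if "w" in onset and nucleus in BACK_VOWELS:
        return False                        # no w or Cw before a back vowel
    return True


if __name__ == "__main__":
    print(is_valid_syllable(("r", "w"), "ɑ"))   # labialized onset before a
    print(is_valid_syllable(("k", "j"), "ɑ"))   # *Cy onset
```

Heterorganic compounds would be passed as a single onset symbol here, in line with the note that they count as single consonants.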

Syllabic l occurs only due to a vowel being elided between two l's:

*molelo (Proto-Bantu *mu-dido) ⇒ mollo fire (cf Setswana molelo, isiZulu umlilo)
*ho lela (Proto-Bantu *-dida) ⇒ ho lla to cry (cf Setswana go lela, isiXhosa ukulila, Tshivenda u lila)
isiZulu ukuphuma to emerge ⇒ ukuphumelela to succeed ⇒ Sesotho ho phomella

There are no contrastive long vowels in Sesotho, the rule being that juxtaposed vowels form separate syllables (which may sound like long vowels with undulating tones during natural fast speech).[10] Originally there might have been a consonant between vowels which was eventually elided that prevented coalescence or other phonological processes (Proto-Bantu *g, and sometimes *j).

Other Bantu languages have rules against vowel juxtaposition, often inserting an intermediate approximant if necessary.

Sesotho Gauteng ⇒ isiXhosa Erhawudeni

Phonetic processes

Vowels and consonants very often influence one another, resulting in predictable sound changes. Most of these changes are either vowels changing vowels, nasals changing consonants, or approximants changing consonants. The sound changes are nasalization, palatalization, alveolarization, velarization, vowel elision, vowel raising, and labialization. Sesotho nasalization and vowel raising are particularly unusual since, unlike most processes in most languages, they actually decrease the sonority of the phonemes.

Tonology

Sesotho is a tonal language with two contrasting tones, low and high. Further investigation reveals, however, that only the high tones are explicitly specified on syllables in the speaker's mental lexicon; low tones appear wherever a syllable is tonally underspecified. Unlike the tonal systems of languages such as Mandarin, where each syllable basically has an immutable tone, the tonal systems of the Niger–Congo languages are much more complex: several "tonal rules" manipulate the underlying high tones before the words are spoken. These include special rules ("melodies") which, like the grammatical or syntactic rules that operate on words and morphemes, may change the tones of specific words depending on the meaning one wishes to convey.
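The idea that only high tones are stored and low tones are filled in by default can be illustrated with a very small sketch. It models the default fill only and deliberately ignores the tonal rules and melodies mentioned above, which apply before a word is actually spoken. The function name `default_fill` and the representation of a word as a syllable count plus a list of high-toned positions are assumptions made for this example.

```python
def default_fill(n_syllables, high_positions):
    """Return one tone label per syllable: "H" where specified, else "L".

    n_syllables    -- number of syllables in the word
    high_positions -- zero-based indices of syllables lexically marked high
    """
    highs = set(high_positions)
    if any(i < 0 or i >= n_syllables for i in highs):
        raise ValueError("high tone position outside the word")
    return ["H" if i in highs else "L" for i in range(n_syllables)]


if __name__ == "__main__":
    # A three-syllable word with a single underlying high tone on syllable 1.
    print(default_fill(3, [1]))
```

A word with no underlying high tones simply comes out all low under this fill, which is the sense in which low tone is the unspecified value.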

Stress

The word stress system of Sesotho (often called "penultimate lengthening" instead, though there are certain situations where it doesn't fall on the penultimate syllable) is quite simple. Each complete Sesotho word has exactly one main stressed syllable.

Except for the second form of the first demonstrative pronoun, certain formations involving certain enclitics, polysyllabic ideophones, most compounds, and a handful of other words, there is only one main stress falling on the penult.

The stressed syllable is slightly longer and has a falling tone. Unlike in English, stress does not affect vowel quality or height.

This type of stress system occurs in most of those Eastern and Southern Bantu languages which have lost contrastive vowel length.

The second form of the first demonstrative pronoun has the stress on the final syllable. Some enclitics can leave the stress of the original word in place, causing the resultant word to have the stress on the antepenultimate syllable (or even earlier, if the enclitics are compounded). Ideophones, which tend not to obey the phonetic laws which the rest of the language abides by, may also have irregular stress.

There is even at least one minimal pair: the adverb fela (only) has regular stress, while the conjunctive fela (but) (like many other conjunctives) has stress on the final syllable. This is certainly not enough evidence to justify making the claim that Sesotho is a stress accent language, though.
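The regular pattern and the fela minimal pair can be summarised in a short sketch. It assumes a word is already divided into syllables and that lexical exceptions are flagged by the caller; the function name `stressed_index` and the `final_stress` flag are hypothetical, and the enclitic and ideophone cases described above are not modelled.

```python
def stressed_index(syllables, final_stress=False):
    """Return the zero-based index of the main stressed syllable.

    Regular words take stress on the penult. Words flagged with
    final_stress (such as the conjunctive fela) take it on the last
    syllable. A one-syllable word can only be stressed on that syllable.
    """
    if not syllables:
        raise ValueError("a word needs at least one syllable")
    if final_stress or len(syllables) == 1:
        return len(syllables) - 1
    return len(syllables) - 2


if __name__ == "__main__":
    fela = ["fe", "la"]
    print("adverb fela (only):", fela[stressed_index(fela)])
    print("conjunctive fela (but):", fela[stressed_index(fela, final_stress=True)])
```

Treating final stress as a per-word flag reflects the point made above that the exceptions are too few to make stress contrastive in general.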

Because the stress falls on the penultimate syllable, Sesotho, like other Bantu languages (and unlike many closely allied Niger–Congo languages), tends to avoid monosyllabic words and often employs certain prefixes and suffixes to make the word disyllabic (such as the syllabic nasal in front of class 9 nouns with monosyllabic stems, etc.).

Notes

  1. ^ Other authors may choose to include the labialized consonants as contrastive phonemes, potentially increasing the number by 26 to 75. Labialization does create minimal pairs, as is exemplified by the short passive suffix, but different authors seem to be divided on whether or not these should be counted as authentic phonemes (especially since Sotho–Tswana-type labialization caused by vowel "absorption" is a fairly strange and rare process).

    Besides the passives, there are still numerous minimal pairs differing only in the labialization of a single consonant (note that each of the following pairs has similar tonal patterns):

    -rala design, versus -rwala wear certain clothing (shoes, socks, gloves, hats, etc); carry (a load) on the head
    -lala lie down (old fashioned or poetic), versus -lwala be sick (old fashioned)
    mora son, versus morwa a Khoisan person
    -hama milk an animal, versus -hwama (of fat) congeal
    -tshasa smear, versus -tshwasa capture prey with the intention of killing it
    mohla day, versus mohlwa termite(s)

    Normal consonants and their labialised forms do not contrast before back vowels (that is, a labialized consonant will lose its labialization before a back vowel).

  2. ^ The Sotho–Tswana ejective plosives /pʼ/, /tʼ/, and /kʼ/ come from the Proto-Bantu *mb, *nd, and *ŋg due to the radical effects of the nasalization process. The Proto-Bantu stops *p, *t, and *k have usually become /f/, /r/, and /x/ (/ʀ/ and /h/ in modern Sesotho) with * becoming [fu], and the nasalized forms of these (Proto-Bantu *mp, *nt, and *ŋk) are the two aspirated plosives /pʰ/ and /tʰ/, and the aspirated velar affricate /k͡xʰ/ (/x/ in most Sesotho speaking communities).

    Note that some Sotho–Tswana languages do have prenasalized consonants, or at least have less strict and varied nasalization rules, but this is almost certainly as a result of influence from neighbouring non-Sotho–Tswana languages.

  3. ^ a b Strictly speaking, /t͡ɬʰ/ should be an allophone of /ɬ/ found only when /ɬ/ is nasalized. However, possibly due to the mixed origins of Sesotho, there are several instances of /t͡ɬʰ/ appearing without nasalization (as is the case in Setswana) or of /ɬ/ failing to nasalize when the nasalizing consonant is not visible (such as when forming polysyllabic class 9 nouns).

    Thus one finds:

    ho hlaha to emerge/be born ⇒ class 9 tlhaho nature
    ho hlompha to respect/honour ⇒ class 9 hlompho respect

    where the nasalization is applied in the first noun but not the second.

  4. ^ A further collapse occurred in Silozi, which has lost the generally unusual distinction between plain and aspirated consonants. Thus Sesotho /ɬ/, /t͡ɬʼ/, /t͡ɬʰ/, /tʼ/, and /tʰ/ all map to the single Silozi phoneme /t/.
  5. ^ Note that the IPA symbols used for the near-close vowels in this and related articles are different from those often used in the literature. Often the symbols /ɨ/ and /ʉ/ are used instead of the standard /ɪ/ and /ʊ/, but these two symbols represent the close central unrounded vowel and the close central rounded vowel respectively in the modern IPA.
  6. ^ There are many historical instances in Sesotho which show an occasional confusion between the phonemes /j/, /ɦ/, and (no consonant). For example, the verb -aha (build) often appears as -haha (cf. Silozi -yaha), though comparison with other languages (Setswana -aga, Nguni -akha, etc.) reveals its true form.

    Other examples include the changing of the original verbal "focus marker" *-ya- to -a-; the second person singular objectival concord (-o-, but Setswana -go- and Nguni -ku-); the verb -laya (correct, give a person what they deserve, instruct; its Proto-Bantu form *-dag- should have given -laa, which does occur as a variant); verbs which end in the form -iya (e.g. -siya leave behind, -diya cause to fall, etc.) being alternatively rendered as -ia; lee (egg; Proto-Bantu *di-gi) often appearing as lehe; etc. It should also be noted that many verbal derivatives treat verbs ending with -ya as if they end with -a (that is, the suffix replaces the entire -ya, not just the final -a).

  7. ^ In Setswana and most Northern Sotho languages these are two different phonemes. The Setswana velar fricative corresponds to the Sesotho glottal fricative, and the velar affricate corresponds to the Sesotho velar fricative/affricate, but before the close vowel u Setswana regularly uses the unvoiced glottal fricative.
  8. ^ For completeness, this table uses a narrower (more detailed) transcription of clicks than usual in Bantu languages, but the rest of this article and other articles in the series use the less detailed system of click transcription. See the full consonant table above to see the usual transcriptions.
  9. ^ Historically, in various Bantu languages, this has resulted in palatalization (giving the postalveolar and palatal consonants) and the alveolar fricative s.
  10. ^ This is not to say that the glottal stop is part of the phoneme inventory of Sesotho, nor is it correct to say that the language has diphthongs or triphthongs (or even longer: ha o a e utlwa You did not hear it). Sequences of vowels may be pronounced with hiatus (thus they are not diphthongs), but in fast speech they may simply flow into each other (thus the glottal stop is not a contrastive phoneme).
  11. ^ Historically kg was an affricate /k͡xʰ/ (this still appears as a variation) and was therefore not an exception.

    Some individuals nasalize kg and h to kh (possibly by analogy with the Setswana hu nasalizing to khu) and sometimes even k (perhaps due to the unstable nature of the voiced h, which is barely audible and may cause the syllable to sound as if it does not have an onset). Though this is certainly not to be considered standard, it is an understandable reaction to the frication ("weakening") of the affricate /k͡xʰ/.

  12. ^ Strangely, there are no polysyllabic verbs beginning with y. The verb -ya cannot be used with an objectival concord (it may have an intransitive, locative, or instrumental import and an idiomatic passive, but is not transitive) and the approximant is removed in verbal derivations. There are also no adjectives beginning with y or any other parts of speech which may be nasalized, so there are no instances of y being nasalized.

    Note that if a y were to nasalize by getting a k in front of it, the phonotactic restrictions and phonetic rules of the language would not allow the combination *ky. In Silozi, which has many verbs commencing with y (many of which correspond to Sesotho vowel verbs), nasalization of y results in c (/t͡ʃ/), which has collapsed from original Sotho–Tswana j, tj, and tjh. Since nasalization removes voicing and frication (and Sesotho palatalization preserves aspiration), one may then deduce that if Sesotho y were to nasalize it would most probably become tj.

  13. ^ This second change is very strange and does not occur in most other major Sotho–Tswana languages.
  14. ^ a b The symbols used in this and related articles for the raised allophones of the near-close vowels are non-standard, though there really aren't any standard alternatives...

    The difficulty lies in acknowledging the role of ATR in this process. In the past, when they were recognised at all, they were often viewed as simply an extra vowel height, and the choice of symbols differed between authors since standard IPA does not recognise the possibility of so many contrastive close vowel heights.

  15. ^ In Sesotho, when a consonant is followed by a vowel, the shape of the lips is changed to resemble the shape of the vowel while the consonant is being pronounced (or even before, when the syllable is the first after a pause) with the shaping being more severe the higher the vowel height. Thus, when a consonant is followed by a back vowel the lips are rounded when pronouncing the consonant, and the lips are spread when pronouncing a consonant followed by a front vowel. Labialization may be explained by saying that, for some reason, the lips are rounded in anticipation of a back vowel that is never pronounced. This also explains why labialization disappears before back vowels. Since the lips will already be rounded anyway in anticipation of the following vowel, there is no way to distinguish between a labialized consonant before a back vowel and a normal consonant before a back vowel (this is similar to the situation in English where /hw/ — written as wh — is pronounced /h/ in words such as whom, whole, and whore). Note that it is also possible for labialization to simply disappear, even if any other modification of the consonant caused as a side-effect of labialization remains. One example is the tentative evolution of modern Sesotho ntja (dog) from Proto-Bantu *N-bua:
    Proto-Bantu *N-bua ⇒ (nasal homogeneity) *m̩bua ⇒ (labialization) *m̩bʷa ⇒ (palatalization) *m̩pʃʷa ⇒ (loss of labialization + gaining of ejective quality) *m̩pʃʼa (as found in Northern Sotho) ⇒ (heterorganic simplification + nasal homogeneity) modern [ɲ̩t͡ʃʼɑ]
